Towards a Near-real-time Protocol Tunneling Detector based on Machine Learning Techniques
In recent years, cybersecurity attacks have increased at an
unprecedented pace, becoming ever more sophisticated and costly. Their impact
has involved both private/public companies and critical infrastructures. At the
same time, due to the COVID-19 pandemic, the security perimeters of many
organizations expanded, causing an increase in the attack surface exploitable
by threat actors through malware and phishing attacks. Given these factors, it
is of primary importance to monitor the security perimeter and the events
occurring in the monitored network, according to a tested security strategy of
detection and response. In this paper, we present a protocol tunneling detector
prototype which inspects, in near real time, a company's network traffic using
machine learning techniques. Indeed, tunneling attacks allow malicious actors
to maximize the time in which their activity remains undetected. The detector
monitors unencrypted network flows and extracts features to detect possible
ongoing attacks and anomalies by combining machine learning and deep
learning. The proposed module can be embedded in any network security
monitoring platform able to provide network flow information along with its
metadata. The detection capabilities of the implemented prototype have been
tested on both benign and malicious datasets. Results show 97.1% overall
accuracy and an F1-score of 95.6%.
Comment: 12 pages, 4 figures, 4 tables
Sentic LDA: improving on LDA with semantic similarity for aspect-based sentiment analysis
The advent of the Social Web has provided netizens with new tools for creating and sharing, in a time- and cost-efficient way, their content, ideas, and opinions with virtually the millions of people connected to the World Wide Web. This huge amount of information, however, is mainly unstructured, as it is produced specifically for human consumption, and hence it is not directly machine-processable. In order to enable a more efficient passage from unstructured information to structured data, aspect-based opinion mining models the relations between opinion targets contained in a document and the polarity values associated with these. Because aspects are often implicit, however, spotting them and calculating their respective polarity is an extremely difficult task, which is closer to natural language understanding than to natural language processing. To this end, Sentic LDA exploits common-sense reasoning to shift LDA clustering from a syntactic to a semantic level. Rather than looking at word co-occurrence frequencies, Sentic LDA leverages the semantics associated with words and multi-word expressions to improve clustering and, hence, outperform state-of-the-art techniques for aspect extraction.
A Learning Scheme Based on Similarity Functions for Affective Common-Sense Reasoning
This paper explores the theory of learning with
similarity functions in the context of common-sense reasoning and
natural language processing. Based on this theory, the proposed
approach (called Sim-Predictor) is characterized by the process
of remapping the input space into a new space which is able to
convey the similarity between the input pattern and a number
of landmarks, i.e., a subset of patterns randomly extracted from
the training set. The new learning scheme exhibits the interesting
property of relating the dimensionality of the remapped space
to the learning abilities of the eventual predictor in a formal
fashion. The evaluation phase shows that Sim-Predictor compares
favorably with ELM and SVM when addressing the problem of
polarity detection in the sentic computing framework, a novel
approach to big social data analysis based on the interpretation
of the cognitive and affective information associated with natural
language (affective common-sense reasoning).
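The landmark remapping described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian similarity function, the `gamma` value, the toy 2-D data, and the nearest-centroid rule standing in for the final predictor are all assumptions chosen for brevity. The only parts taken from the abstract are the random extraction of landmarks from the training set and the remapping of each input to its vector of similarities to those landmarks.

```python
import math
import random


def similarity(a, b, gamma=0.5):
    """Gaussian similarity (an assumed choice): 1.0 for identical points."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * sq_dist)


def remap(x, landmarks):
    """Describe x by its similarity to every landmark."""
    return [similarity(x, lm) for lm in landmarks]


def fit(points, labels, n_landmarks, seed=0):
    """Pick random landmarks, remap the training set, keep one centroid per class."""
    rng = random.Random(seed)
    # Landmarks are a random subset of the training patterns.
    landmarks = rng.sample(points, n_landmarks)
    centroids = {}
    for label in set(labels):
        rows = [remap(p, landmarks) for p, y in zip(points, labels) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return landmarks, centroids


def predict(x, landmarks, centroids):
    """Assign x to the class whose centroid is nearest in the remapped space."""
    z = remap(x, landmarks)

    def sq_dist_to(label):
        return sum((a - b) ** 2 for a, b in zip(z, centroids[label]))

    return min(centroids, key=sq_dist_to)


# Hypothetical toy data: two well-separated groups with polarity -1 and +1.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
labels = [-1, -1, -1, -1, 1, 1, 1, 1]
landmarks, centroids = fit(points, labels, n_landmarks=3)
print(predict((0.5, 0.5), landmarks, centroids))
print(predict((5.5, 5.5), landmarks, centroids))
```

The dimensionality of the remapped space equals the number of landmarks, which is the quantity the abstract relates to the learning abilities of the predictor.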
An ELM-based model for affective analogical reasoning
Between the dawn of the Internet and 2003, there were just a few dozen exabytes of information on the Web. Today, that much information is created weekly. The opportunity to capture the opinions of the general public about social events, political movements, company strategies, marketing campaigns, and product preferences has raised increasing interest both in the scientific community, for the exciting open challenges, and in the business world, for the remarkable benefits in marketing and financial prediction. Keeping up with the ever-growing amount of unstructured information on the Web, however, is a formidable task and requires fast and efficient models for opinion mining. In this paper, we explore how the high generalization performance, low computational complexity, and fast learning speed of extreme learning machines can be exploited to perform analogical reasoning in a vector space model of affective common-sense knowledge. In particular, by enabling a fast reconfiguration of such a vector space, extreme learning machines allow the polarity associated with natural language concepts to be calculated in a more dynamic and accurate way and, hence, perform better concept-level sentiment analysis.
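For readers unfamiliar with extreme learning machines, the general idea behind their fast learning speed can be sketched as follows: the hidden layer is drawn at random and never trained, so only the output weights are fitted, in closed form. This is a generic illustration under stated assumptions, not the model from the paper: the `ELM` class, the tanh activation, the ridge term, the hidden-layer size, and the 2-D "concept vectors" with polarity labels are all hypothetical.

```python
import math
import random


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


class ELM:
    """Single-hidden-layer network with a fixed random hidden layer."""

    def __init__(self, n_inputs, n_hidden, ridge=0.1, seed=0):
        rng = random.Random(seed)
        # Hidden weights and biases are drawn once and never trained.
        self.w = [[rng.uniform(-1, 1) for _ in range(n_inputs)]
                  for _ in range(n_hidden)]
        self.b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
        self.ridge = ridge  # assumed regularization strength
        self.beta = None

    def hidden(self, x):
        """Random nonlinear features of x."""
        return [math.tanh(sum(wi * xi for wi, xi in zip(w, x)) + b)
                for w, b in zip(self.w, self.b)]

    def fit(self, xs, ys):
        """Fit output weights via ridge least squares: (H^T H + rI) beta = H^T y."""
        h = [self.hidden(x) for x in xs]
        n = len(self.b)
        hth = [[sum(row[i] * row[j] for row in h) + (self.ridge if i == j else 0.0)
                for j in range(n)] for i in range(n)]
        hty = [sum(row[i] * y for row, y in zip(h, ys)) for i in range(n)]
        self.beta = solve(hth, hty)
        return self

    def predict(self, x):
        """Real-valued score; its sign is read as the polarity."""
        return sum(b * h for b, h in zip(self.beta, self.hidden(x)))


# Hypothetical 2-D concept vectors with polarity +1 / -1.
positive = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (0.9, 0.9)]
negative = [(-1.0, -1.0), (-1.1, -0.8), (-0.9, -1.2), (-1.2, -1.1)]
xs = positive + negative
ys = [1.0] * 4 + [-1.0] * 4
model = ELM(n_inputs=2, n_hidden=10).fit(xs, ys)
print(model.predict((1.1, 1.0)), model.predict((-1.0, -1.1)))
```

Because fitting reduces to one linear solve, refitting after the vector space changes is cheap, which is the property the abstract refers to as fast reconfiguration.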
Machine Learning Techniques applied to Twitter Spammers Detection
Every minute more than 320 new accounts are created on Twitter and more than 98,000 tweets are
posted. Among the multitude of Twitter users, spammers and cybercriminals aim to pervade and strike
legitimate users' accounts with a large number of troublesome messages. Hence, Social Network
propagation opens new modalities for cyber-crime perpetration, while the spamming phenomenon exploits
specific mechanisms of the messaging process. This research shows that Machine Learning (ML) may provide a
powerful tool to support spammer detection on Twitter. The present paper compares the performance of three
different ML algorithms in tackling this task. The experimental session involves a publicly available dataset